DOCUMENT RESUME

ED 088 927                                                    TM 003 487

AUTHOR        Anderson, Scarvia; And Others
TITLE         Priorities and Directions for Research and
              Development Related to Measurement of Young Children:
              Report on Task 2.
INSTITUTION   Educational Testing Service, Princeton, N.J.
SPONS AGENCY  Office of Child Development (DHEW), Washington, D.C.
REPORT NO     ETS-PR-72-22
PUB DATE      Oct 72
NOTE          37p.
EDRS PRICE    MF-$0.75 HC-$1.85
DESCRIPTORS   Action Research; *Children; Cultural Factors;
              Environmental Influences; Kindergarten Children;
              *Measurement; Measurement Techniques; Preschool
              Children; *Research; Research Utilization; Social
              Influences; *Testing; Testing Problems; Test
              Validity

ABSTRACT
A panel of 15 experts in child development, early childhood education
and measurement met in September 1972 to assist the Office of Child
Development in establishing priorities in improving tests and
measurements for young children. A summary of the panel discussion is
presented along with the specific recommendations made by the
participants. The key issues under consideration were: (1) the special
statistical and methodological problems of measuring the behavior of
young children and the impact of their environments because of the
limited response system of young children and the rapid changes that
occur in early life; (2) the considerations of construct-based
measurement, particularly the problems of population and ecological
validity that are inherent in the use of measures with different
cultural groups; and (3) the dependency of the advancement of
measurement research and development on appropriate policy decisions,
and the availability and training of manpower. (EH)
PRIORITIES AND DIRECTIONS FOR
RESEARCH AND DEVELOPMENT RELATED TO
MEASUREMENT OF YOUNG CHILDREN

Report on Task 2 under
OCD Grant Number H-2993 A/H/0

October 1972

EDUCATIONAL TESTING SERVICE
PRINCETON, NEW JERSEY



U.S. DEPARTMENT OF HEALTH, EDUCATION & WELFARE
NATIONAL INSTITUTE OF EDUCATION
THIS DOCUMENT HAS BEEN REPRODUCED EXACTLY AS RECEIVED FROM
THE PERSON OR ORGANIZATION ORIGINATING IT. POINTS OF VIEW OR
OPINIONS STATED DO NOT NECESSARILY REPRESENT OFFICIAL NATIONAL
INSTITUTE OF EDUCATION POSITION OR POLICY.










This report presents a synthesis of the results of a panel discussion held
in Princeton on September 19-21, 1972, and addressed to assisting the Office of
Child Development in establishing priorities in improving the field of tests
and measurements in early childhood. The panel members were:



Ernest M. Bernal, Jr.
Director, Bilingual Early Elementary Program
Southwest Educational Development Laboratory

Courtney Cazden
Professor of Education
Harvard University

Edgar G. Epps
Professor of Urban Education
University of Chicago

Susan W. Gray
Professor of Psychology
George Peabody College for Teachers

Henry J. Mark
Associate Professor of Pediatrics, Otolaryngology, Environmental Medicine
Johns Hopkins Hospital

Virginia Shipman
Senior Research Psychologist
Educational Testing Service

Irving E. Sigel
Professor of Psychology
State University of New York at Buffalo

Herman Witkin
Senior Research Psychologist
Educational Testing Service


Others attending the meeting were Esther Kresh, Office of Child Development,
and Scarvia Anderson, Samuel Ball, Ruth Ekstrom, Nathaniel Hartshorne,
Ann Jungeblut, and Samuel Messick of Educational Testing Service.



The report was prepared by Scarvia Anderson, Samuel Messick, and Nathaniel 
Hartshorne. 



Contents

Science is measurement.

Improvement of measures begins with improvement in conceptualization.

Conclusions about children and their environments will stand or fall
on the basis of the adequacy with which major variables are assessed.

Assessment of environments is important to understanding the functions
of individuals.

Action systems are needed for the effective utilization of research
and measurement in educational and social applications.

Effective measurement development, research, administration, and
utilization are dependent on availability and training of manpower.

The advancement of measurement research and development in early
childhood depends upon appropriate policy decisions.

Summary of Recommendations.

Appendix: Specific Panel Recommendations.



Science Is Measurement 



The basic task of measurement is the generic task of all science: the
marshaling of evidence to support an inferential leap from an observed
consistency in the empirical world to a construct that will explain that
consistency. In psychometric parlance, this is the problem of construct
validity. In judging the adequacy of measurement, there are important
statistical and methodological criteria (such as validity and reliability)
that must be satisfied, but these are simply part of the central requirement
of a theoretical rationale.

Measurement of young children and their environments presents some special
challenges to the statistical and methodological criteria because of the
limited response system of the young child and the very rapid changes that
occur in early life. However, these problems must not sidetrack the
investigator from basic theoretical inquiry into the nature of child
development and educational functioning, an inquiry in which measurement can
play a central and organizing role.

This paper, therefore, is primarily concerned with considerations of
construct-based measurement, particularly with the problems of population and
ecological validity that are inherent in the use of measures with different
cultural groups.



Improvement of Measures Begins with Improvement in Conceptualization



Measurement development pursued as part of a theoretical framework instead of
on an ad hoc basis permits one to (a) evaluate the adequacy of the measurement
in terms of the meaning of the construct, (b) consider individual score
differences as representing more or less of the trait measured, and (c) compare
and integrate results across studies in terms of common constructs.

If we eventually want to use measurement for practical purposes such as
diagnosis and evaluation, we must be prepared to justify that use in terms of
social consequences, and these cannot be evaluated without information about
the meaning of the measure. No accumulation of sterile statistics can
compensate for lack of understanding.

Multiple Measures and Multiple Domains

The meaning of a measure is interpreted or evolves from its pattern of
relationships with other theoretically relevant measures (convergent validity)
and its lack of relationship with theoretically unrelated measures
(discriminant validity). Therefore, research and development on measurement
must be multivariate in nature. This is a general principle of all measurement:
physical, environmental, and sociological as well as psychological. It implies
that to explain the meaning of a measure in full, it is important to examine
its operation in domains other than the one from which it derives. The
investigator interested in a psychological variable such as field independence,
for example, would be interested in how it operates in many situations and
across different cultural groups. This introduces the notions of population
validity (the extent to which the meaning of a measure, or the results of an
experiment, will generalize to other population groups), ecological validity
(the extent to which generalization is possible to other environmental
settings), and task validity (the extent to which the selected measurement task
is representative of the external domain of interest or of other tasks sampled
from the same domain). These are much more powerful conceptions of validity
than the limited and simplistic criterion-oriented methodology characteristic
of applied statistics.
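
The convergent and discriminant pattern described above can be illustrated
with a small computation. What follows is a minimal sketch in Python, not a
procedure taken from the panel discussion. The three measure names and the
eight scores under each are hypothetical values invented for illustration, and
the Pearson correlation is assumed here only as one common index of
relationship.

```python
from math import sqrt


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length score lists."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


# Hypothetical scores for eight children (illustrative only, not report data).
new_measure = [12, 15, 9, 20, 17, 11, 14, 18]        # measure under study
related_measure = [10, 14, 8, 19, 15, 10, 13, 17]    # theoretically relevant
unrelated_measure = [6, 3, 4, 5, 4, 5, 3, 6]         # theoretically unrelated

# Convergent evidence: a strong relationship with the relevant measure.
r_convergent = pearson(new_measure, related_measure)
# Discriminant evidence: a weak relationship with the unrelated measure.
r_discriminant = pearson(new_measure, unrelated_measure)

print(f"convergent r   = {r_convergent:.2f}")
print(f"discriminant r = {r_discriminant:.2f}")
```

With these made-up scores the first correlation is high and the second is
near zero, which is the pattern the construct interpretation requires. In
practice the same comparison would be repeated across population groups and
settings before any claim of population or ecological validity is made.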






The extent to which a measure has the same meaning or displays the same
properties in the same or different groups under different conditions
(including different times) is an important empirical question. It is
particularly critical in early childhood research and evaluation because
progression through developmental stages may involve qualitative
reorganizations of psychological dimensions, thereby changing the meaning
derived from the measures. Different processes may be involved in a task at
different times: a test of number conservation given at age 5 may reflect
intuitive perceptual understandings, whereas at age 7 it may reflect concrete
operational thinking. This is the problem of continuity vs. discontinuity in
measurement. Within this framework, then, it is very important for improvement
of the field to focus not only upon changes of levels in performance over time
but also upon changes in patterns of organization of these performances.

Important Domains and Constructs in Child Development 

In the early childhood area, there are some key domains that require intensive
examination to uncover or define salient constructs. In other cases, there
are promising constructs that require further elucidation. Third, there is a
need to search out constructs that cut across domains and offer the possibility
of explaining interactive processes.

Some key domains that should be investigated are family processes, language
development, affective development, coping strategies, learning processes (as
opposed to outcomes), and adult decision processes related to the care and
treatment of children. It is not that some of these domains lack measures
(there are a great many measures of "Ability to Cope with Personal-Social
Demands") but that they lack the kind of conceptualization or theoretical
organization that makes possible adequate assessment of the quality and
meaning of the measures.

Some promising constructs requiring further elucidation, especially at the
early childhood level (much work may have been done on older populations which
cannot be simply extrapolated downward), include dimensions of creativity,
intelligence, and cognitive style. Other important constructs deserving further
attention as a basis for sound measurement of young children are components of
concrete and formal operational thinking derived from Piaget.




A major concern in the study of human development is to understand the
integration of differentiated subsystems in the child and, in particular, the
interplay of cognition and personality. Investigations into this area should
take account of three possible forms of this interplay: mediation, interaction,
and transaction. Processes in one domain may mediate functioning and
development in another ("dependency" may mediate the development of "analytical
skills" in cognition); variables in two or more domains may interact to
determine function or development (cognition plus motivation may determine
academic achievement, unpredictable from either variable alone); and an
observed behavior may be so holistic in character that it represents a
transaction among contributing variables of such a nature that they can no
longer be discretely identified or disentangled (a child's interrupting a
teacher to ask a question may represent an aggressive act, an act of
dependency, or an act of cognitive coping, or all of these things at once, in
which case a more complex abstraction is required).
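
The second form of interplay, interaction, can be made concrete with a
deliberately extreme, artificial case. The sketch below is in Python and rests
entirely on assumptions: the two-level coding of cognition and motivation, the
five children per cell, and the crossover scoring rule are all hypothetical and
are chosen only so that neither variable alone carries any information about
achievement while the two together determine it completely.

```python
from collections import defaultdict


def group_means(keys, values):
    """Mean of `values` within each distinct key."""
    totals = defaultdict(list)
    for key, value in zip(keys, values):
        totals[key].append(value)
    return {key: sum(vals) / len(vals) for key, vals in totals.items()}


# Hypothetical 2 x 2 layout: each variable coded -1 (low) or +1 (high),
# with five children in every cell.
children = [(c, m) for c in (-1, 1) for m in (-1, 1) for _ in range(5)]

# Assumed pure-interaction rule (illustrative only): achievement depends on
# the product of the two codes, not on either code by itself.
achievement = [50 + 10 * c * m for c, m in children]

by_cognition = group_means([c for c, _ in children], achievement)
by_motivation = group_means([m for _, m in children], achievement)
by_cell = group_means(children, achievement)

print("by cognition :", by_cognition)    # identical means at both levels
print("by motivation:", by_motivation)   # identical means at both levels
print("by cell      :", by_cell)         # means differ across the four cells
```

A study that measured only one of the two variables would find no relationship
with achievement in data of this kind, which is one reason single-construct
designs can miss what multi-domain designs reveal.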

Implications for Research and Development 

What are the implications of this stress on theoretically based measurement for
those who fund and encourage research and development efforts in the early
childhood area? There are several:

1. Investigations involving multiple domains and multiple measures have a
greater chance of advancing knowledge in the field than do studies of single
constructs or measures, however global.

2. Since policy decisions to initiate, enlarge, or terminate programs are
based on the results of particular evaluation studies, it is important that
such studies include analyses of results across individuals, population groups,
and situations.

3. When new measures are needed for research and evaluation efforts,
preference should be given to those that have been derived as part of a
theoretical framework rather than to ad hoc developments.






Conclusions about Children and Their Environments Will Stand or Fall on the
Basis of the Adequacy with which Major Variables Are Assessed



Just as we cannot necessarily extrapolate constructs from one age to another or
measures from one situation to another, neither can we necessarily extrapolate
assessment principles or test theory appropriate for older ages to the
measurement of young children. At the same time, we must not lower technical
standards just because the subjects of study are young and because their
functioning is less differentiated and more dependent on situational
influences. It is still essential to make inferences from the performance of
young children to underlying personality and cognitive processes, and this
requires as firmly based and well supported evidence as any other kind of
measurement inference, probably even more so because of the child's greater
susceptibility to contextual variations.



Six Major Needs

In applying assessment procedures to young children, there are six major needs 
that deserve special attention: 

1. The need for a systematic examination of currently accepted test theory
and principles and commonly held assumptions to determine their applicability
to the assessment of young children: A good example of why this is necessary
can be seen in the whole body of prescription and practice that has grown up
around the concept of guessing on multiple-choice items. The frequently used
formula S = R - W/(n - 1) (where S = score, R = number of items right,
W = number wrong, and n = number of choices offered in an item) seems fairly
sensible when applied to a population who have developed out of their
experience some specific strategies for test taking ("If you can eliminate one
or more choices as clearly wrong, guess; if you cannot eliminate any choice as
clearly wrong, don't guess"). It is highly unlikely, however, that children of
five or six would have developed such strategies. Similarly, we can expect
relatively sophisticated test takers to recognize that the correct answer is
equally likely to appear in any of the response positions. When faced with
difficult items, however, young children without this insight are more likely
to respond in terms of position biases or other types of response sets.






2. The need to develop procedures specific to the measurement and analysis
of change: Traditional psychometric methods employed in test and scale
construction emphasize indices of internal consistency and stability. They seek
items that maximally discriminate between individuals at a given point in time.
However, these may not be the items that are optimally responsive to change
processes occurring as a function of development or educational treatment. It
has been suggested that a new kind of psychometrics needs to be developed to
handle the special problems associated with measurement of change. Special
problems also arise in the analysis of change; we must go beyond the assessment
of differences in level to investigate the possibilities of differences in
structure that might signal changes in the meaning of measures across time
periods. Methodological investigations into the measurement of change are
especially vital to research in early childhood.

3. The need to eliminate all irrelevant measurement difficulty: Requiring
memory or reading abilities in a test of social studies competency may be
permissible for 12th graders because those kinds of ancillary skills are
required at such a simple level that individual differences in them do not
contribute to response variance. However, for younger children individual
differences in such abilities are likely to be pronounced and would tend to
contaminate any measure of their understanding of social studies. This kind of
contamination has led to the fair charge that many achievement and ability
tests are really reading tests in disguise. Other examples of irrelevant
difficulties that may interfere with valid assessment include a response
procedure that is almost as difficult for the child to understand as the
problems posed by the test itself, or a time limit that is severely restrictive
when the test task requires varying amounts of reflection by the respondents.
Slavish adherence to "standardized" administration procedures has sometimes
been more of a detriment than a contributor to test validity. The important
thing with young children is to design test materials and arrange testing
conditions in any way that will maximize the likelihood that the child will
understand the task demands and respond along the dimensions intended by the
examiner, dimensions intrinsic to the construct being investigated; in other
words, to ensure that the test task becomes the child's task.






4. The need to match response requirements both to the task at hand and
to the relatively undifferentiated response system of the young child: At the
infant level, of course, the problems of choosing meaningful response channels
are exacerbated. (Ingenious investigators have turned to dimensions of the
orientation reflex, for example, to obtain indicators of attention and
information-processing abilities and consistencies in the infant.) At the same
time, however, the ability of even very young children to respond in a variety
of ways should be thoroughly explored and not underestimated out of hand.

5. The need to extend measurement standards not only to young age levels
but also to non-test instrumentation: The use of the word "test" in the
preceding discussion does not imply that investigators using other forms of
measures such as questionnaires, observations, and interviews are relieved of
obligations to demonstrate the adequacy of their techniques. However, these
kinds of measures are not as typically supported with evidence on reliability
and construct validity, partly because investigators in these areas are not
generally as immersed in psychometric thinking and partly because
questionnaires and observations apparently capture behaviors in such a direct
way that they are sometimes taken at face value.

6. The need to explore relationships between unobtrusive measures and
standardized test procedures: Some general confusion surrounds the attempts
to avoid the problems of irrelevant difficulty by substituting unobtrusive
measures, since procedures such as observation are sometimes misclassified as
unobtrusive. There are really two dimensions leading to four quadrants of
classification here: reactive vs. nonreactive (in terms of the measurement
task), obtrusive vs. unobtrusive (in terms of the measurement context).
Observations are frequently nonreactive but obtrusive. (Indeed, whenever the
investigator or observer is present on the scene, problems of obtrusiveness
come to the fore; so also may problems of reactivity.) Standardized situational
tasks observed through one-way screens are reactive and unobtrusive. Trace
measures such as "noseprints on the glass" or "worn-down tiles" are neither
reactive nor obtrusive.
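
The two-dimensional classification under the sixth need can be written down as
a small data structure. The following Python sketch is offered only as an
illustration: the `Measure` class and the `quadrant` function are hypothetical
names, the first three entries follow the examples given in the text, and the
fourth entry is an assumed example, since the text does not name a measure for
the reactive and obtrusive quadrant.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Measure:
    name: str
    reactive: bool    # dimension defined in terms of the measurement task
    obtrusive: bool   # dimension defined in terms of the measurement context


def quadrant(measure: Measure) -> str:
    """Label the quadrant a measure falls in on the two dimensions."""
    task = "reactive" if measure.reactive else "nonreactive"
    context = "obtrusive" if measure.obtrusive else "unobtrusive"
    return f"{task} / {context}"


measures = [
    # Observations are described as frequently (not always) in this quadrant.
    Measure("observation with observer present", reactive=False, obtrusive=True),
    Measure("situational task seen through one-way screen", reactive=True, obtrusive=False),
    Measure("trace measure such as worn-down tiles", reactive=False, obtrusive=False),
    # Assumed example for the remaining quadrant; not taken from the text.
    Measure("test given face to face by an examiner", reactive=True, obtrusive=True),
]

by_quadrant = {}
for m in measures:
    by_quadrant.setdefault(quadrant(m), []).append(m.name)

for label in sorted(by_quadrant):
    print(f"{label}: {'; '.join(by_quadrant[label])}")
```

Recording both flags for every instrument in a study makes it harder to
misfile an observation as unobtrusive simply because it is nonreactive.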






What needs to be done is to recognize how task variations along these
dimensions influence the kinds of inferences investigators are prone to make
and what additional kinds of evidence may be required to support inferences in
the different cases. For example, if a measure is blatantly obtrusive, what
kinds of supplementary evidence need to be accumulated to ascertain whether the
obtrusiveness seriously contaminates the meaning of the scores derived? This
may be viewed as a special case of the problem of method variance contaminating
trait variance.

The decision to use either natural or contrived settings often appears to
be a matter of the investigator's taste, when it should depend on the proposed
use of the "scores" in subsequent analyses. If the observational measures are
to serve as dependent variables, they should be derived from standardized
situations. If they are to serve as independent variables describing the
program or treatment, they may be derived from naturalistic settings, although
valuable predictive (independent variable) information can also stem from
standardized situations. Confusion on this point may result in such anomalies
as treating the number of questions a child asks in class as a descriptor both
of the kind of educational process he is experiencing and the outcomes of the
particular educational treatment.

It is important to add that in a systems view of the organism interacting
with his environment, the labelling of variables as "dependent" or
"independent" may not be as important as recognizing their interdependence.
However, this view does not eliminate consideration for each variable of the
logic of measurement and experimental control. For instance, in the example of
question asking given above, there would be little hope of predicting
individual consistencies in question-asking behavior from observations obtained
in a naturalistic setting where children had widely varying opportunities and
occasions to ask questions.

Implications for Research and Development 

Investigations should be launched into the appropriateness and properties of
measurement methods as well as into the nature of the constructs being measured,
and these enterprises should proceed simultaneously. Most important is the
need to match methods of measurement to both the characteristics of the
constructs and the response capabilities of the subjects.
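
One small investigation of this kind concerns the correction-for-guessing
formula discussed under the first of the six needs, S = R - W/(n - 1). The
Python sketch below applies that formula to a child who knows none of the
answers and always marks the same response position. The two 20-item answer
keys and the function names are hypothetical, and the treatment of omitted
items as neither right nor wrong is the usual convention for this formula
rather than something stated in the report.

```python
from fractions import Fraction


def corrected_score(right, wrong, n_choices):
    """Formula score S = R - W / (n - 1), kept exact with fractions."""
    return Fraction(right) - Fraction(wrong, n_choices - 1)


def score_fixed_position(key, position, n_choices):
    """Score a child with no knowledge who always marks one position."""
    right = sum(1 for correct in key if correct == position)
    wrong = len(key) - right
    return corrected_score(right, wrong, n_choices)


# Hypothetical 20-item, four-choice answer keys (positions numbered 1 to 4).
balanced_key = [1, 2, 3, 4] * 5                      # each position keyed 5 times
unbalanced_key = [1] * 10 + [2, 3, 4, 2, 3, 4, 2, 3, 4, 2]   # position 1 keyed 10 times

print(score_fixed_position(balanced_key, 1, 4))      # always marks position 1
print(score_fixed_position(unbalanced_key, 1, 4))    # same habit, unbalanced key
print(score_fixed_position(unbalanced_key, 2, 4))    # always marks position 2
```

Under the balanced key the correction brings the position-biased child to a
score of zero, as it would for blind random guessing. Under the unbalanced key
the same response habit yields a positive or a negative corrected score
depending only on which position the child favors, so the formula's adequacy
for young children depends on properties of the test as well as of the child.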






Assessment of Environments Is Important to Understanding the
Functions of Individuals



Psychometric science grew primarily out of attempts to measure characteristics
of individuals, and the majority of measurement efforts over the years have
been in this direction. Recently, however, there has been increasing
recognition that human behavior is multiply determined by a variety of internal
and external variables and that test responses, being behaviors in the small,
are similarly multiply determined. Hence, if we are to understand sources of
test variance and the constructs underlying test performance, we must give
increased attention to the context of that performance and the environmental
factors influencing it. More generally this concern embraces both the
assessment of the immediate context in which the measurement of the individual
takes place and the assessment of the broader environmental settings
influencing educational and psychological development.

It is assumed that the assessment of environmental variables should follow
the same principles of construct measurement outlined in the second section of
this report. In other words, measurement development should be based on theory.
However, most of the constructs underlying measurement of individuals are
derived from psychological theory (particularly cognitive and personality
theory) while those underlying measurement of environments are derived from
sociological, economic, ecological, and social-psychological theories.
Interpersonal relationships (including person-group and group-group
interactions) are primarily the concern of social psychologists;
interenvironmental relationships (recognizing overlapping environmental
variables that impinge one on the other) are the concern of systems analysts
and operations researchers. The interaction of people and environments is the
growing concern of the newly evolving fields of environmental psychology and
ecology.

General Environments

Typically, individuals and environments are measured separately and their
interactions are studied through research. Investigators adopting this strategy
are presently more hampered by lack of adequate measures of the environment
than of the person.



Some of the environmental areas of importance to child development where
improved measurement is needed are dimensions of family process and
socialization, educational programs, physical and spatial properties and
constraints, and school and community life. At the same time, there is a need
to reexamine some of our conventional demographic measures of socioeconomic
status; if, indeed, we are to conduct investigations across population groups,
we must demonstrate the comparability of meanings of such measures for the
several groups.

In addition, some of the variables of person-environment interaction are
coming to be conceptualized as constructs in their own right and this offers
the intriguing possibility of measuring such variables directly. In fact, this
is one of the most promising directions for future measurement research and
development. It would permit us to take direct account, for example, of the
possibility that each child in a classroom may actually be experiencing a
different educational program and that each sibling may be living in a different
home environment. Furthermore, we may have to recognize and measure certain
processes that mediate between the individual and his environments, as in the
study of social perception and personal space. In many cases, we may
misconstrue the nature of relationships derived solely from measures of the
individual and of people and environments that impinge upon him or even from
direct measures of the interaction among them; we may have to measure the
person's perception of these other people and environmental characteristics and
interpret the interrelationships and interactions from the standpoint of his
personal construction of the world.

The Assessment Environment 

We are sometimes interested in the context of assessment primarily to identify
possible threats to the validity of assessment results. This concern is
especially pertinent to interpretation of measurement results obtained with
very young children. While relatively wide variations in testing conditions
and settings may have very little effect on the test performance of adults
(especially if the assessment relates to their motivations or aspirations),
they can drastically alter the performance of children. At least investigators
must devise methods of assessing these test-condition variables. (This is in
addition, of course, to attempting to devise facilitating and positive contexts
for testing.) Some of the kinds of context assessment that are important here
are interpersonal (child-examiner or child-child if there is more than one child
in the assessment setting), personal (including the child's response styles
and feelings of adequacy in coping with the task demands), environmental
(intrusive external events), temporal (how long does the assessment take?),
physical (room arrangement, heat, light, and so on), and examiner-based (the
examiner's characteristics and administration styles). Just as we need
comparability of measurement of constructs across investigators to permit
accumulation of knowledge and impact, so do we need comparability of methods of
assessing the context of assessment to compare results investigators get using
the same measures.

Implications for Research and Development

Priorities in the area of environmental assessment in child development include
attention to direct measures of interaction and assessments of mediating
processes as well as measures of common "main effect" environmental variables.
In addition, it is important to document the immediate context of assessment to
clarify possible influences on scores that may require qualifications of
inferences and generalizations.






Action Systems Are Needed for the Effective Utilization of 
Research and Measurement in Educational and Social Applications 



One of the perennial difficulties in dealing with educational and social problems
is moving from research and development to practical applications of its ideas
and materials, applications that are practical in economic, political,
organizational, and humanistic terms. The most sophisticated approach to these
difficulties is one that develops action systems that include essential
components and takes account of all of the interests and concerns of the
various parties to the enterprise. In education this means recognizing
explicitly that approaches that do not meaningfully involve teachers and
parents in developing the goals of a new curriculum project are unlikely to
succeed.

What kind of action systems would be appropriate to carry out the ideas
that result from research in early learning? An example of a complex system
very much desired by those concerned with early learning disabilities would
involve the following components:

1. An assessment battery, well-grounded conceptually and valid in terms
of its predictive consequences, to identify children likely to experience
educational problems and to diagnose specific deficiencies and proficiencies.

2. Guidelines for interpretation of battery results at a level of
complexity appropriate to the phenomenon of interest: The discrete pieces of
information from the assessment may be combined in various ways, depending upon
the identification/diagnostic needs. In some cases, combinations of weighted
scores may be sufficient; in others, the important thing would be not so much
level as pattern of, and discrepancies in, performance.

3. Treatment specifications and prescriptions based on the assessment
results or patterns for individual children: The determination of relevant
specifications and prescriptions for appropriate programs that results from
the diagnosis of deficiencies and proficiencies must itself be the result of
extensive research and development efforts. This is probably the most
important missing link in the child development field and should be given the
highest priority.

4. Procedures for periodic monitoring of the progress of children in the
programs and for evaluation of the effectiveness of the treatments: These
procedures should include some of the same instrumentation used in the initial
assessment battery. In addition, they may include assessments of reactions to
children's progress and to the programs by parents and other concerned groups.

5. Correction mechanisms keyed to the results of component 4 (above) to
enable (a) new treatment prescriptions for children as predictions and diagnoses
change and (b) modifications of treatment specifications to try to improve them:
Because of the rapid developmental changes, both qualitative and quantitative,
that are likely to occur, such a recycling component is vital in a system
designed to serve the educational needs of young children.

Any action system should contain within it, from the very outset of its 
implementation, this kind of provision for periodic collection and analysis of 
evaluative information in order to effect its improvement. In some cases, this
means identifying changes in conditions that might require program modification. 
In addition, if evaluative information is positive, it can be used to justify 
the continuation of a program or, if negative, to modify or terminate it on a
rational basis. The inclusion of cost-effectiveness information in a
program-evaluation model increases its utility for these purposes.

Program evaluation within a construct framework, if sufficiently systematic
in design and execution, can qualify as research on educational process
with the potentiality for contributing to the advancement of knowledge about
child development and practice that that implies.

At a less ambitious level, it has been suggested that it would be of great 
service, especially to local educational planners, to have access to a kit of
measures from which they might choose instruments to try out in their own
action systems. The measures in the kit would be selected by experts from 
fields concerned with the assessment of young children and their environments. 
The experts would employ selection criteria related to such characteristics as 
construct validity, other kinds of validity, reliability, adequacy and clarity 
of administration directions, availability of related equipment, and compre- 
hensiveness of the total collection. The last reflects the major point already 
made in this paper about the importance of multi-domain, multi-measure investi- 
gations, where domains include the psychological, physical, and sociological 
and the measures include tests, naturalistic observations, and questionnaires. 
The principle applies as much to action systems as to research studies. Any
such kit effort as that described above would be useful over time only to the
extent that provision was made for periodic updating of its contents, including
elimination and addition of measures.

An important principle distilled from Seymour Sarason's book, The Culture 
of the School and the Problem of Change,* is that whenever an attempt is made
to improve or change a social enterprise involving several interested parties,
and this attempt takes into account all of the vested interests but one, the
neglected party will rise up in an organized fashion to destroy the effort.
Nowhere is Sarason more likely to be proved right than in as socially and
politically sensitive an area as one involving the measurement of young 
children. 

Preventing the effort from being scuttled, however, is not the only reason 
for involving all interested parties in such an enterprise. Actually, their
contributions to the conceptualization of measurement-related problems and the
selection and application of measures can make the results of those processes
more meaningful. Teachers know better than any other group what educational
actions they have the facilities and resources to undertake following the diag- 
nosis of children's educational needs. Parents know better than any other group 
what educational aspirations they have for their children. Both teachers and 
parents frequently know better than anyone else what kinds of materials and
situations the children are likely to respond to. 

Using Standard-related Measures 

A significant movement in educational measurement today is away from inter- 
pretation of test performance in relative (normative) terms to interpretation 
in terms of standards of acceptable or desirable performance. Leaders of this
movement use various terms such as "criterion-referenced measurement,"
"domain-referenced measurement," and "measurement for mastery." What they are all
saying is that for purposes of improving a child's performance it is more important
to know where he stands relative to standards of accomplishment than to
the performance of others (although the latter may provide significant signs of
potentialities or possible problems in his development, and such signs are
especially important at early age levels).
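
The difference between the two kinds of interpretation can be made concrete with a brief sketch in Python. The comparison-group scores, the child's score, and the cutoff are hypothetical; in practice the standard would be set by the interested parties and not chosen arbitrarily as it is here.

```python
def percentile_rank(score, norm_group):
    """Norm-referenced reading: percent of the comparison group scoring below `score`."""
    below = sum(1 for other in norm_group if other < score)
    return 100.0 * below / len(norm_group)


def meets_standard(score, cutoff):
    """Standard-related reading: has the child reached the agreed level of performance?"""
    return score >= cutoff


# Hypothetical raw scores for a comparison group of ten children.
norm_group = [4, 5, 6, 6, 7, 8, 9, 10, 11, 12]
cutoff = 10        # hypothetical standard of acceptable performance
child_score = 9

rank = percentile_rank(child_score, norm_group)
mastered = meets_standard(child_score, cutoff)

print(rank)      # standing relative to other children
print(mastered)  # standing relative to the standard
```

The same score yields two different statements: the child scores above more than half of the hypothetical comparison group yet has not reached the hypothetical standard, which is the information most relevant to prescription.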



* Sarason, S. The culture of the school and the problem of change. Boston,
Mass.: Allyn & Bacon, 1971.



The use of such standard-related measures is especially pertinent to
action programs because it can be tied directly to educational prescription
and intervention. Moreover, the emphasis upon standards highlights the necessity
of confronting the value issues of what is "good" in a pluralistic
society or whose standards will prevail. When a local program undertakes the
development or use of such measures, the involvement of all the interested
parties takes on special urgency and significance. They must be involved not
just in the standard-setting process but also in the prior identification of
the goals that the standards are meant to serve and the means whereby attempts
will be made to reach them.

It should be emphasized that there are research as well as operational 
implications of the criterion-referenced thrust. Work needs to be done on the 
technical properties of such measures and their relationship to construct-based
assessment. Furthermore, it needs to be generally recognized that the
development of criterion-referenced measurement in contradistinction to
norm-referenced measurement does not mean that normative information is not valuable.
Indeed, it is unfortunate that the two measurement approaches are frequently
viewed as a polarity, for not only can they usefully supplement each other but
normative considerations, albeit usually implicitly, often underlie the choice
of instrument content as well as the performance standards set for
criterion-referenced tests.

Consequences of The Program and Effects on The People Involved 

An area of special concern — and difficulty — in measurement programs is the 
effects of the measures and the accompanying decision processes on the people
involved. These are, of course, inextricably intertwined because the effects 
of the measurement on people are frequently mediated by the decisions and 
actions of other people. In the case of the effects of measurement on young 
children, however, the "other people" have the unfair advantage of being 
larger and stronger! 



Consider, first, the decision makers in measurement programs. Stress has
already been placed on both (a) the essentiality of involving key parties in
the decisions about what constructs are to be assessed and how they are to be
assessed in terms of the purposes of the program and (b) the importance of
insisting that the assessment conditions and materials will not involve irrelevant
difficulties, make some individuals feel anxious, threatened, or alienated,
constitute invasions of privacy, and so on.

But there are some prior and overriding questions the decision groups 
must face, including the purposes and social consequences of the program and 
any assessment that is required for it. Furthermore, before the program is 
put into operation, they must specify the uses and limitations on uses to which 
the measurement results can be put. Typical education uses relevant to young 
children include instructional guidance of individuals or groups, evaluation 
of the effectiveness of an innovative program, and curriculum or program revision.
Typical misuses that must be guarded against from the outset derive
from misconceptions of the phenomena being measured ("intelligence" is a major
case in point), exaggerated expectations about the infallibility of tests,
tendencies to take seriously insignificant differences between scores, injudicious
presentations of results (in forms not directly related to the needs
of the teacher, counselor, or other interpreter), and desires to make data
collected for one purpose serve other purposes for which it was not intended
or particularly appropriate.

Tests and other measures can have both positive and negative effects on 
those who take them, administer them, and interpret them. The takers in early
childhood projects are usually the children, and frequently their parents and
teachers as well. Taking a test should not be an unpleasant experience for a
child. In fact, if the measure is appropriately designed, the activity can be
rewarding and even fun. Moreover, some testing situations provide an excellent
opportunity for the teacher or other adult to observe a child
intensively and study his reactions and coping behaviors for insights this
information may provide for future educational efforts. In addition, a good
assessment battery can do much to promote among teachers and others consideration
of the complexity of children and the broad range of skills, attitudes,
social competencies, and so on that characterizes children's development and
underlies their responses to educational and social stimuli. Experience with
construct-based measures can enhance understanding of the constructs on which the
measures are based. Similarly, a good questionnaire can increase a mother's
consciousness of factors, including values, important in her and her child's 
life. 

There are numerous such examples of possible positive effects of measures
on those involved with them, and the list of possible negative effects includes
anxiety, stimulation of over-competitiveness, and invasion of privacy.
The point is that there is a serious need for continuous consideration of potential
social and personal consequences in any proposed use of measurement.
These ethical issues must be squarely faced as an integral part of decision
making in measurement research and application.

Implications for Research and Development

In this section dealing with action programs involving measurement of young
children, the major principle is that processes of decision making about uses
of measurement should occur within a rational framework that includes attention
to: 

1. The interdependencies of the components of the action system.

2. The priorities and will of the parties to the enterprise.

3. Provision for evaluative information for the improvement and adaptation
of the system.

4. Possible measurement side effects (negative and positive).

5. The decision processes themselves. 

6. The ethical basis for the assessment (and the system) in terms of 
personal and social consequences. 



Effective Measurement Development, Research, Administration, and
Utilization Are Dependent on Availability and Training of Manpower



The current OCD interest in the establishment of a new profession of Child Development
Associate (CDA) is a recognition of the shortage of trained person-power
to assist in Head Start, Parent-Child centers, and other programs
dedicated to serving young children in the United States. In terms of the
focus of this report, this shortage is felt especially in the area of administering
assessment instruments. It is recommended, therefore, that special
programs be developed (in relationship to the CDA effort or otherwise) to train
people in the skills, sensitivity, patience, flexibility, and humor that good
administrators of measures for young children must have.

This is not an easy recommendation to implement. Wisdom and economics
are on the side of using testers from the same communities as the children
being tested, which implies a nationwide training effort. It is difficult and
time-consuming to train people in assessment skills that may have to be applied
to a variety of situations and instruments, and include skills in administering
measures to parents, teachers, and other adults in children's lives as well as
the children themselves. Perhaps, after initial training, periodic refresher 
courses or short-term courses to train in new measures would be required. 
Furthermore, not many people in any one community can expect to make testing 
a full-time occupation. Therefore, it is important that people be trained in 
other skills as well that will make them useful in a wide range of child 
services. 

Manpower training programs also need to be developed in the instrument-development
process or "art," as it is sometimes described. As we have suggested
elsewhere, there is a far-from-perfect correlation between knowing
what to assess and knowing how to assess it. Development of instrumentation 
for young children presents unusual problems that standard university tests 
and measurements courses do not usually cope with. 

The various applications of measurement in relationship to child development
require different mixes of expertise and experience. The researcher,
evaluator, administrator, diagnostician, and teacher all represent specialized
roles, and, while many individuals frequently are able to play several of them,
we must recognize that it is also possible and sometimes quite efficient to
have different measurement-related tasks handled by different individuals 
trained in the specific mix of skills required. This is not to imply that 
their training in assessment should be separate from the other aspects of their 
professional training. Rather, it might be better to embed assessment in their
total curriculum. However, investigations should be made into the best methods
of increasing assessment-related skills and knowledge through existing or new
structures.



The Advancement of Measurement Research and Development in Early
Childhood Depends upon Appropriate Policy Decisions



The responsibilities for establishing supportive policies and atmosphere are
shared by public agencies, private agencies, and individual professionals committed
to research, development, and evaluation in the early childhood field.
The agencies are asked to become well enough acquainted with the field of
measurement in early childhood -- and such recommendations as those included in
this report -- to appreciate the need to support committed researchers over time.
"Committed" is related to the central importance of having measures firmly
grounded in constructs and theories. "Time" refers both to the time this kind
of effort takes and to the time necessary to allow children's developmental
sequences to occur and be observed and studied. At the extreme, it can be
mentioned that some of our best-known and respected psychological measures
represent a product of all or a large part of the careers of prominent investigators.

Of course, it is frequently important to know whether a program is working 
and it may be impossible to wait for several years to find out. Even in such 
urgent situations, however, it is essential to provide enough time and support 
for sound instrument development/selection and the necessary accompanying ra- 
tional processes. Otherwise, the report of the investigation, however prompt, 
can lead to wrong interpretations and unsound policy decisions. 

On the other hand, individual investigators must not undertake sponsored
research and evaluation studies for which time and resources are inadequate.
And, when they can document their positions, it is important for them to be
able to count on moral support from their institutions and professional organizations.
This implies, of course, that they have been active in educating
their institutions and developing organizational positions about the requirements
for sensible research and evaluation efforts. At the same time, however,
investigators must also come to appreciate that in a time of pressing social
problems and rapid social change they no longer have the autonomy of time that
some of them previously enjoyed. The point is that a workable balance must be
struck, but the major problems at this stage appear to derive more from thoughtless
action than from actionless thought.



Some of these problems can be avoided if agencies adopt a policy of supporting
"targeted" research in an area, as opposed to directed investigations.
The RFP route generally seems more applicable to the world of defense contracts
than to the world of social science research.

Government and private agencies have generally not been inclined to
support measurement development itself. Rather they have supported research
and evaluation efforts that have included some instrument development. Viewed
from the perspective of the first major recommendation in this report -- for
measurement development pursued as part of a theoretical framework -- this general
strategy is probably quite wise. However, in some instances, moving a promising
instrument or measurement technique from the research setting into the field
demands specific adaptation and development efforts. And it may very well be
the case that the researchers who initiated the instrumentation or technique
are not the best people to ready it for operational use. In such instances,
it would be appropriate to support the further development work in its own
right, at the same time according special respect to those whose talents lie
in the direction of eliminating irrelevant difficulty, adjusting stimulus and
response requirements to the subject and the purpose of the measurement, and
in other ways ensuring that the measurement task becomes the task of the individual
who confronts it.

The atmosphere for measurement research and development can be strongly
influenced by two kinds of procedural routines -- those having to do with review
and with dissemination. In the first case, an agency's responsibility is to
ensure that reviews are professionally sound and that the purposes of the
process are fully explained to the researchers and evaluators whose work is
being reviewed. Such reviews should have a formative and constructive component.
If project reviews appear to serve only censorship or manipulative functions,
they may have the direct consequence of limiting the direction of the
investigation for political (or nonresearch) reasons or the indirect consequence
of fostering so much advance concern that they lead to undue self-censorship to
avoid possible difficulties. These statements are not to be construed as an
indictment of external reviews of projects, procedures, and instruments; reviews
are important and necessary to scientific inquiry. However, attention
must be paid to making review procedures positive rather than punitive to keep
from endangering the very investigations they are designed to serve.



Science thrives on public disclosure of its results. Any policies which
seem to prevent or delay publication of the reports of investigations undertaken
under agency auspices are viewed with alarm by most investigators. Many properly
refuse to undertake projects when the results are intended for the sponsor's
eyes alone. However, a more frequent problem in the area of dissemination has
to do with making reports and products of sponsored investigations widely
available. Commercial publishers and distributors -- if they are good -- are
generally considered more able to ensure national publicity and dissemination
than private or government organizations. However, ways must be found to overcome
some of the copyright, "public domain," and royalty problems that have
inhibited their performance of services in recent years. And working relationships
have to be developed among reputable commercial organizations, investigators,
and sponsoring agencies to stimulate the dissemination of promising
products. The possibility of such avenues can greatly relieve the frustrations
of researchers who have often in the past felt that some of their best ideas
and inventions were sentenced to gathering dust on a shelf.

Finally, in the policy domain, we need to emphasize that the involvement
of all of the parties with vested interests in the enterprise is just as important
in research and development efforts as it is in action programs. This
means that if research and development in early childhood is to focus on a
particular minority group, every attempt should be made to involve researchers
who thoroughly understand the problems of that group. Such involvement could
range from minority-group direction of a project to collaboration to consultation,
depending upon circumstances of time and available expertise. A sponsoring
agency's obligations in this area include special efforts to let contracts
for minority-group research to minority-group organizations and active encouragement
of collaboration between minority researchers and other research organizations.
Minority-group organizations have a concomitant obligation to keep
informed about likely sources of support for investigations of special interest
to them, to propose appropriate research and development efforts, and to be
willing to offer their collaborative and advisory services to other research
and development groups.



Implications for Research and Development Atmosphere

This section calls for conscious attention to the possible influences on the 
atmosphere for research and development of policy decisions in such areas as
time and resources for investigations, amount and kind of external direction,
types of projects to be supported, review and dissemination procedures, and 
involvement of relevant groups. 

The policy-making process, it should be emphasized, has two distinct 
consequences: one the intended regulative effect and the other a change in 
the evaluative context or atmosphere of the regulated domain. This change in 
atmosphere affects the way people look at things, the details they select for 
emphasis, the interpretations they favor, and it thereby helps to determine
the values of the future. 



Summary of Recommendations



1. If measurement is to serve a practical purpose in the study of young children
and their environments, its use must be justified in terms of social
consequences and these cannot be evaluated without understanding the meaning
of what is being measured. This understanding is possible only if measurement
development is carried out within a theoretical framework. 

The central task in all measurement, as in all science, is one of gather- 
ing evidence to support a theoretical explanation of phenomena observed in 
the empirical world. In psychometrics, this is the task of construct validity. 
Inherent in the construct validity approach to measurement is the notion that 
a variable cannot be measured in isolation. To find the meaning of a measure, 
one must examine the ways in which it relates and does not relate to other 
relevant measures. Thus, any investigation of one measure must involve others.
Moreover, one must investigate how that measure functions in different situa- 
tions and across different cultural groups. 

For these reasons, it is important that investigations of childhood meas- 
ures involve multiple measures and multiple domains. Further, since policy 
decisions to initiate or terminate programs are based largely on the results
of evaluation studies, it is important that such studies include analyses of 
results across individuals, population groups, and situations. Finally, when 
new measures are needed for research and evaluation, preference should be given 
to those that have been developed as part of a theoretical framework rather 
than to ad hoc developments. 
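
The point that a measure takes its meaning from the ways in which it relates and does not relate to other measures can be sketched in Python as follows. All three sets of scores are hypothetical, and the measure names are illustrative only; the sketch computes an ordinary product-moment correlation to compare a measure with a second measure of the same construct and with a measure of an unrelated one.

```python
from math import sqrt


def pearson(x, y):
    """Product-moment correlation between two equally long lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariation = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return covariation / (spread_x * spread_y)


# Hypothetical scores for six children on three measures.
vocabulary_test = [3, 5, 6, 8, 9, 12]
vocabulary_observation = [2, 4, 7, 7, 10, 11]   # same construct, different method
motor_speed = [7, 3, 9, 4, 8, 5]                # different construct

convergent = pearson(vocabulary_test, vocabulary_observation)
discriminant = pearson(vocabulary_test, motor_speed)

print(round(convergent, 2), round(discriminant, 2))
```

With these invented scores the two vocabulary measures agree closely while the motor measure is nearly unrelated, the kind of pattern that would support interpreting the test as a measure of vocabulary. Real evidence would of course require adequate samples and several groups and situations.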

2. Current methods of measurement that have been found to be appropriate
for older age groups cannot necessarily be applied to the assessment of young
children. Most test-taking strategies that have become part of the older
student's mental repertoire are unknown to the child of five or six. For the
young child, many achievement tests that are designed to measure competence in
specific subject areas are contaminated by reading and memory requirements.
Further problems are posed by response procedures, which are often as difficult
as the measurement tasks themselves, and time limits that are severely restrictive.

In addition to studying the nature of the constructs being measured, 
studies should be conducted to investigate currently accepted methods of meas- 
urement to determine ways of designing test materials and arranging testing 
conditions to ensure that the test tasks become the child's tasks. At the 
same time new kinds of measuring techniques may have to be developed to capture 
the complex behaviors of young children over time. 

3. Children's responses to measurement tasks are influenced by many 
different factors both external and internal. If we are to understand children's
responses to these tasks and the constructs being measured, we must give
increased attention both to the context in which the measurement of the indi- 
vidual takes place and the larger environmental factors that influence the 
child's development. 

Some specific areas of importance that should be investigated include 
dimensions of family process and socialization, educational programs, physical 
and spatial properties and constraints, and school and community life. In
addition, there are a number of other factors having to do with the relation- 
ship between the individual and his environment that are important in their 
own right and that we may soon be able to measure directly. 

Meanwhile, investigators must devise methods of assessing those aspects 
of testing conditions and settings that contribute to variations in assessment 
results among young children.

4. Effective action systems are required to make it possible to apply the 
ideas of research and development to practical needs in the field. A model 
system might include such components as these: 

- An assessment battery, well-grounded in theory and valid in its predictive
implications, to identify problems and specific deficiencies and proficiencies.

- Guidelines for interpreting results of the test battery.

- Specific treatments and prescriptions based on assessment results.

- Periodic monitoring of children's progress and effectiveness of the
treatments.

- Procedures that would permit new prescriptions or modifications of
existing ones if predictions and diagnoses change.

The success of all such programs will depend on the extent to which they 
reflect the priorities and goals of all those who are involved in their cre- 
ation and use. 

Those who assume the responsibility of translating the results of research 
and development into action programs such as those described above must also 
assume the responsibility for the social consequences of such programs in the 
communities they serve. 

5. There is a clear need today to establish programs to train people in 
the personal and technical skills that are necessary in the administration of
measures for young children. Such training should cover a wide range of child 
services and include provision for periodic refresher courses. Programs are 
also needed to train people in the development and application of instruments 
in child development enterprises. 

6. If they are to create the proper climate for the advancement of measurement
research in the field of early childhood, public and private agencies
should become well enough acquainted with the field to support committed researchers
for as long as they need to create measures based on carefully thought
out constructs and theories.

Many problems of time and money can be avoided if agencies adopt a policy 
of supporting targeted research in an area instead of attempting to direct
investigations.

Although government and private agencies have in the past been inclined 
to support research and evaluation efforts that included measurement development
rather than the development of measurement itself, it may be necessary to
support such efforts in order to move promising techniques from the research 
laboratory to the field. 



Appropriate policy decisions pertaining to review of investigations and 
dissemination of products and reports are also important to creating an
appropriate atmosphere for measurement research and development. 

Finally, it should be emphasized that involvement of all the parties concerned
with a project is every bit as important in research and development
as it is in the establishment of action programs. If research focuses on a
particular minority group, every attempt should be made to involve researchers
who understand the problems of that group.



Appendix: Specific Panel Recommendations



A. Construct-based measurement development and research

Measurement development pursued as part of a theoretical framework. 

Systematic simultaneous assessment of individuals and environments.

Longitudinal or developmental assessment of the changing organization
of capacities, not just linear accretion in them.

Identification of constructs that are common to different subject groups
but may need to be measured with different content and methodology.

Research to relate cognitive styles to functioning in the educational
situation.

Assessment of ability to utilize skills, not just possession of them.

Instruments related to the child's ability to organize the environment
-- cognitive and affective; e.g., sense of competence, confidence
in ability to cope, ability to tolerate failure, ability to apply
alternative coping strategies, learning how and when to learn,
internal locus of control.

Instruments in such universally important social-emotional areas as
empathic abilities; tolerances of differences in appearance, thinking,
etc.; feelings of competence; willingness to initiate actions.

Measures of representational ability in order to be able to deal with 
hindsight and anticipation (a mediating facility, as Piaget might 
say). 

Good measures of children's communication processes. 

Continued pursuit of differential assessment of different aspects of 
language development. 

Development of early detection tools (school skills, minimum CNS) 
sensitive and specific to dysfunctions and specific learning 
disabilities for two critical ages, 2 1/2 and 4 1/2. 



B. Technical characteristics and adequacy of measurement

Development of measurement standards particularly appropriate to 
assessment of young children and their environments — standards 
for instrument developers and users.

Research into the methodology of assessment of young children, with 
emphasis on variations in assumptions and theories as a function 
of subject age and culture. 

Examination of the ecological validity of measures before extrapolation 
of program recommendations to other settings and groups. 

Standardized situational "tests" to supplement information obtained
from more conventional tests. 

Disentanglement of the uses of observational measurement of independent 
and dependent variables; the former can be obtained in naturalistic 
or standardized settings, but the latter requires standardized 
settings and tasks. 

Assessment of children's ability in ways and settings that engage 
realistic processes — especially vital in assessment of functions 
at a concrete as opposed to a formal level. 

Assessment of cognitive skills through non-reading modalities. 

Investigation of possible cultural bias or boundedness in construct 
definition as well as in measurement. 

Analysis and reporting of possible "order" effects attributable to the
arrangement of instruments in a battery.

Recognition of the "richness" of information that may be obtained from
a measure -- not just conventional scores but other potentially important
data such as response sets, distractibility, etc.

Development of adequate "practice" materials for tests designed for 
administration to young children. 

Investigations into the usefulness of both "limit" and standard testing
procedures in the same setting; discrepancies between a child's
performance under the two conditions may have important clinical and
educational implications.

Provision for validation of constructs across settings (research,
remedial, clinical, etc.).






Routine investigations of administrator-variance when tests are
moved from one setting (e.g., research) to another (e.g., educational
program): is the test author the only one who can get
certain results?

Development of a taxonomy of valid and reliable responses to measurement
tasks that can be obtained from children aged 0 to 9.



C. Conceptualization and measurement related to children's environments

Environmental measures, both specification of properties for measurement
of specific environmental variables and instrumentation for
universal dimensions that cut across specific environments (e.g.,
those that have to do with time coerciveness).

Measures capable of describing dynamic as opposed to static processes
in the child's interactions with his environment.

Measures of children's experience in context (their "individual"
educational programs).

Improvement of instruments used to gather demographic data (e.g., SES)
and determination of comparability of meaning across population
groups.






Action systems involving measurement

Development of strategies for the simultaneous selection of measurement
variables and identification of program needs and for
establishing research, development, and evaluation priorities;
one strategy might involve emphasis on the overlap between research
and consumer priorities and comparisons of treatment effects for
different populations.

Provision for taking account of consumer needs and values in
conceptualization of measurement-related problems and in the development,
selection, and application of measures; consumers include those
directly responsible for the welfare of the children.

Consideration of prescription as a necessary sequent to evaluation,
understanding, and development of a range of alternative program
options (to challenge the consumer to rational choice).

Focus in assessment interpretations on individual differences and
intra-pattern analyses, as opposed to group differences and comparisons.

Investigation of obtaining diagnosis and prediction information from 
a single set of assessments. 

Observational procedures suitable for monitoring the installation and
implementation of an educational innovation.

Assessment of ability to utilize skills, not just possession of them, in
terms of abilities in vocal, pantomime, recorded (reading and writing),
and mathematical/scientific languages predicted from auditory/vocal
and visual/fine-motor skills; determination of the relationships
between such discrepancies and social, emotional, and cognitive problems
of children.

Tests that describe capabilities and limitations for which some
"treatment" can be prescribed (e.g., criterion-referenced tests), as opposed
to tests interpretable only in normative terms.

Selection, by experts, of a multi-measure, multi-domain "kit" or collection 
of measures from which instruments can be selected for tryout at 
local levels.

Survey of the actual educational decision-making processes that attend the 
assessment of young children, for possible insights into improving them.

Inclusion of a search for possible side-effects (positive and negative) 
of measures on young children. 

Investigation of problems associated with "labelling" as a consequence 
of administration of certain child instruments. 

Recognition of and capitalization on the positive side-effects on teachers
of participating in instrument selection, administration, and
interpretation.




"Job analyses" of typical school learning tasks as an aid to program
and instrument development/selection.

Development of self-correcting, uniform (computer-compatible) decision 
trees which display the decision process in the selection of 
teaching strategies matched to ability profiles of groups and 
individuals.


Study of the effects of overt and hidden cognitive skills and handicaps
(and patterns thereof) on the child's scholastic achievement,
social adjustment, emotional adjustment, and the family's
satisfaction with school performance and the child's performance; study
of the specific teaching strategies that are effective with children
of different skill-handicap patterns.

Estimation of potential cost benefits of self-correcting diagnosis-treatment
evaluation systems related to dysfunction and specific
learning abilities at early education levels, especially in
comparison with the costs of such current programs and practices as
"non-promotion," Right to Read Programs, Drop-Out Prevention
Programs, and Special Remedial Programs.



Manpower development and training 



Training procedures and systems for testers and other gatherers of data 
about young children and their environments. 

Development of subprofessional manpower to serve dual roles in
individualized data acquisition, translation, and feedback processes
and to act as "teacher assistants."



Improved and specific training in development of instruments for young
children.




Research and development policy and atmosphere

Support of committed researchers over time: time sufficient to deal
with the complexity of construct assessment and for developmental
sequences to occur.

"Targeted" but not directed stimulation of research and development
in early childhood assessment and research.

Special agency efforts to let research and development contracts to 
minority groups and/or to encourage collaboration between minority 
research groups and other research organizations.

Agency appreciation of the time it takes for rational processes and 
instrument development in research and evaluation efforts in 
early childhood; concomitant professional acceptance of the 
responsibility not to undertake government-sponsored research and
evaluation without adequate time and resources.

Development of specific research and development priorities related
to measurement of infants.

Modification of agency policy (if necessary) to allow for direct 
support of instrument development, especially the application of 
measurement expertise to promising conceptually-based research
instruments. 



